U-MATH & μ-MATH: New university-level math benchmarks challenge LLMs
Detail statistics of all the benchmarks dataset for video description ...
[PDF] Measuring Mathematical Problem Solving With the MATH Dataset ...
OpenR1-Math-220k Dataset Explained: Advancing AI Math Models ...
Math Benchmarks - a cogwheelhead Collection
University-level Math Reasoning Dataset - Toloka
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement ...
Reproducibility benchmarks for dataset 4. (A) Pearson correlations and ...
U-MATH & μ-MATH: new university-level math benchmarks challenge LLMs
GitHub - Paula-X/SmallDataBenchmarks: Small Dataset Benchmarks on the ...
AI learns math reasoning by playing Snake and Tetris-like games rather ...
All Math Benchmark Datasets - a Quadyun Collection
DISTRICT MATH BENCHMARKS: SPRING PARTICIPATION AND PERFORMANCE - OUSD Data
UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts
MMR1-Math-v0-7B Model and MMR1-Math-RL-Data-v0 Dataset Released: New ...
Hugging Face Releases FineMath: The Ultimate Open Math Pre-Training ...
MathVista: Evaluating Math Reasoning in Visual Contexts
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset
Math Benchmarks: What are they and how do I use them? - The Primary Gal
Math Benchmarks: How to Help Your Students Meet Them - Rocket Math
What is a Benchmark? Math Definition, Facts, Examples & Quiz
A Benchmark Dataset for Graph Regression with Homogeneous and Multi ...
30 LLM evaluation benchmarks and how they work
Math Benchmark Data Breakdown Digital and Printable Versions Any Grade ...
(PDF) HARDMath: A Benchmark Dataset for Challenging Problems in Applied ...
Delivering 1000+ HLE-Grade Math Prompts to Benchmark SOTA Models
Description of Benchmarks Datasets | Download Scientific Diagram
The Most Comprehensive Large Model Dataset Sharing: Part 1, Mathematics ...
Details of benchmark dataset used in the experiment. | Download ...
14 Popular LLM Benchmarks to Know in 2025
Dataset statistics of graph-level benchmarks. | Download Scientific Diagram
Math Benchmark Test for Student Growth SGO | Made By Teachers
Overview of the Benchmark Dataset | Download Scientific Diagram
Benchmark Numbers Math Anchor Chart Poster by That One Cheerful Classroom
NVIDIA AI Research Introduce OpenMathInstruct-1: A Math Instruction ...
Performance benchmarks on gold-standard datasets. To test our model and ...
Deciphering the Math in Images: How the New MathVista Benchmark is ...
IsoBench: An Artificial Intelligence Benchmark Dataset Containing ...
[2402.14804] Measuring Multimodal Mathematical Reasoning with MATH ...
NeurIPS 2022 Tenrec A Large Scale Multipurpose Benchmark Dataset For ...
DeepScoresV2 Dataset Benchmark | PDF
Evaluation results from various methods on the benchmark dataset ...
LLM MATH benchmark
The benchmark dataset summarizations. For each dataset, we pick a ...
Simulated dataset for benchmark | Download Scientific Diagram
GitHub - Toloka/u-math: Official evaluation code for the U-MATH and μ ...
hkust-nlp/dart-math-pool-gsm8k · Datasets at Hugging Face
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond ...
LLM Benchmark Evaluation - Apertus-8B | DS-NLP Lab
Statistics of 4 benchmark datasets in terms of samples distribution in ...
Comparison on the benchmark data sets described in Table 1. The ...
Microsoft’s rStar-Math Framework Lets Small AI Models Outperform OpenAI ...
Fully Understanding Large Models in One Article - Benchmarks (Benchmark)_Large Model Benchmark - CSDN Blog
Several benchmark data sets from UCI machine learning repository [33 ...
Top Multimodal Benchmark Datasets
Experimental results on three benchmark datasets | Download Scientific ...
What are LLM Benchmarks?
Statistics of the seven benchmark datasets. | Download Table
The description of benchmark datasets | Download Scientific Diagram
Benchmark datasets overview.|M * | stands for the size of the reference ...
nlile/hendrycks-MATH-benchmark · Datasets at Hugging Face
Details of the six benchmark datasets used in this work | Download ...
Introducing Epoch AI's AI benchmarking hub | Epoch AI
Statistics of benchmark datasets | Download Scientific Diagram
Performance comparison of benchmark datasets | Download Scientific Diagram
Statistics of the four benchmark datasets. | Download Scientific Diagram
Statistics of the Five Benchmark Datasets | Download Scientific Diagram
Summary of benchmark datasets for evaluating methods for differential ...
Summary of the benchmark datasets. | Download Table
Overview of four benchmark datasets. | Download Scientific Diagram
Benchmark datasets and their statistics | Download Scientific Diagram
GitHub - shiwk24/MathCanvas: This is the official repository for the ...
Statistics of the five benchmark datasets. | Download Scientific Diagram
The benchmark datasets ordered by the number of formal concepts ...
A summary of the benchmark datasets. For each dataset, we report the ...
Number of examples and dimensions of each of the 9 benchmark datasets ...
Statistics of the benchmark dataset. | Download Scientific Diagram
Overview of the benchmark datasets. | Download Scientific Diagram
A quantitative comparison of three real-world benchmark datasets and ...
Different methods are more intuitively compared on the six benchmark ...
1: Statistics of the benchmark datasets | Download Table
The summary of benchmark datasets | Download Scientific Diagram
Statistical performance comparison on benchmark datasets | Download Table
Summary of the benchmark datasets | Download Scientific Diagram
Specifics of the benchmark dataset. | Download Scientific Diagram
Statistics of Different Benchmark Datasets used in the Literature to ...
Details of datasets being evaluated. Math: arithmetic reasoning. CS ...
Summary of the benchmark datasets. | Download Scientific Diagram
Statistics of our benchmark dataset. We count the number of documents ...
The statistics of fifteen benchmark datasets. | Download Scientific Diagram
Description of the ten benchmark datasets. | Download Scientific Diagram
Characteristics of four benchmark datasets. | Download Scientific Diagram
The information of 12 benchmark datasets. | Download Scientific Diagram
Statistics of the benchmark datasets | Download Scientific Diagram
Benchmark datasets demonstration. | Download Scientific Diagram
Details of the benchmark datasets | Download Scientific Diagram
The details of the benchmark datasets. | Download Scientific Diagram
Description of the benchmark datasets. | Download Scientific Diagram
Description of the 5 benchmark datasets with their characteristics ...
The detailed statistics of four standard benchmark datasets. |R | and ...
Comparison of different methods on six benchmark datasets. The curve ...
Detailed information of the real benchmark datasets. | Download ...
Description of benchmark datasets | Download Scientific Diagram
Statistics of benchmark datasets. | Download Scientific Diagram
The Most Comprehensive Sharing for Reasoning Dataset: CoT - Related ...
Details of Benchmark dataset. | Download Table
Benchmark datasets. The key features of the four empirical and one ...
The statistics of five benchmark datasets. | Download Scientific Diagram